Smart Paper Notes

Using Machine Learning for Anomaly Detection on a System-on-Chip under Gamma Radiation

Eduardo Weber Wachter, Server Kasap, Sefki Kolozali, Xiaojun Zhai, Shoaib Ehsan, Klaus McDonald-Maier

Category: Machine Learning

2022-01-05

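The emergence of new nanoscale technologies poses significant challenges for building reliable electronic systems in radiation environments. A few types of radiation effects, such as total ionizing dose (TID) effects, often cause permanent damage to such nanoscale electronic devices, and the current state-of-the-art approach to addressing them relies on expensive radiation-hardened devices. This paper focuses on a novel and different approach: using machine learning algorithms on consumer-grade field-programmable gate arrays (FPGAs) to address TID effects, monitoring the boards so that they can be replaced before they stop working. The research challenge in this setting is to anticipate when a board will fail completely due to TID effects. We observed internal measurements of FPGA boards under gamma radiation and used three different anomaly detection machine learning (ML) algorithms to detect anomalies in the sensor measurements in a gamma-irradiated environment. Statistical results show a highly significant relationship between gamma radiation exposure levels and the board measurements. Moreover, our anomaly detection results show that a one-class support vector machine with a radial basis function kernel achieves an average recall score of 0.95. In addition, all anomalies can be detected before the boards stop working.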
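
The recall score quoted in this abstract is the fraction of true anomalies that a detector actually flags. A minimal sketch of that computation follows, assuming hypothetical binary labels (1 = anomalous reading, 0 = normal reading); the label lists are invented for illustration and are not data from the paper.

```python
def recall(y_true, y_pred, positive=1):
    """Return the fraction of positive (anomalous) samples that were detected."""
    # True positives: anomalies that the detector flagged.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    # False negatives: anomalies that the detector missed.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive samples")
    return tp / (tp + fn)


# Hypothetical ground truth and detector output (illustrative only).
y_true = [0, 0, 1, 1, 1, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 1, 0, 1]
score = recall(y_true, y_pred)  # 4 of the 5 anomalies are caught
```

Recall ignores false alarms (the second sample above), so it is usually read together with precision when judging an anomaly detector.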

Translated by Google Translate

Closed-Loop Data Transcription to an LDR via Minimaxing Rate Reduction

Xili Dai, Shengbang Tong, Mingyang Li, Ziyang Wu, Kwan Ho Ryan Chan, Pengyuan Zhai, Yaodong Yu, Michael Psenka, Xiaojun Yuan, Heung Yeung Shum

Category: Computer Vision

2021-11-12

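This work proposes a new computational framework for learning an explicit generative model for real-world datasets. In particular, we propose to learn a closed-loop transcription between a multi-class, multi-dimensional data distribution and a linear discriminative representation (LDR) in a feature space that consists of multiple independent multi-dimensional linear subspaces. We argue that the optimal encoding and decoding mappings sought can be formulated as the equilibrium point of a two-player minimax game between the encoder and the decoder. A natural utility function for this game is the so-called rate reduction, a simple information-theoretic measure of the distance between mixtures of subspace-like Gaussians in the feature space. Our formulation draws inspiration from closed-loop error feedback in control systems and avoids expensively evaluating and minimizing approximate distances between arbitrary distributions in either the data space or the feature space. To a large extent, this new formulation unifies the concepts and benefits of auto-encoding and GANs, and naturally extends them to the setting of learning a representation that is both discriminative and generative for multi-class, multi-dimensional real-world data. Our extensive experiments on many benchmark image datasets demonstrate the tremendous potential of this new closed-loop formulation: under fair comparison, the visual quality of the learned decoder and the classification performance of the encoder are competitive with, and often better than, existing methods based on GANs, VAEs, or a combination of both. We notice that the features of different classes learned in this way are explicitly mapped onto approximately independent principal subspaces in the feature space, and that diverse visual attributes within each class are modeled by the independent principal components within each subspace.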
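
For reference, the rate reduction named in this abstract is commonly written as below. This is a sketch based on the standard coding-rate definition from the rate-reduction literature, not on anything stated in the abstract itself; the symbols (features $Z \in \mathbb{R}^{d \times n}$, class subsets $Z_j$ with $n_j$ columns, distortion $\epsilon$) are notation assumed here.

```latex
% Coding rate of n feature vectors Z in R^{d x n} at distortion epsilon
R(Z, \epsilon) = \frac{1}{2} \log\det\!\left( I + \frac{d}{n\,\epsilon^{2}}\, Z Z^{\top} \right)

% Rate reduction: rate of the whole set minus the weighted rates of its k classes
\Delta R(Z, \epsilon) = R(Z, \epsilon) - \sum_{j=1}^{k} \frac{n_j}{n}\, R(Z_j, \epsilon),
\qquad \sum_{j=1}^{k} n_j = n
```

A large $\Delta R$ means the features as a whole span a large volume while each class stays compact, which is the sense in which it can act as a distance between subspace-like class distributions.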

EDoG: Adversarial Edge Detection For Graph Neural Networks

Xiaojun Xu, Yue Yu, Hanzhang Wang, Alok Lal, Carl A. Gunter, Bo Li

Category: Machine Learning | Artificial Intelligence

2022-12-27

Graph Neural Networks (GNNs) have been widely applied to different tasks such as bioinformatics, drug design, and social networks. However, recent studies have shown that GNNs are vulnerable to adversarial attacks which aim to mislead the node or subgraph classification prediction by adding subtle perturbations. Detecting these attacks is challenging due to the small magnitude of perturbation and the discrete nature of graph data. In this paper, we propose EDoG, a general adversarial edge detection pipeline based on graph generation that does not require knowledge of the attack strategies. Specifically, we propose a novel graph generation approach combined with link prediction to detect suspicious adversarial edges. To effectively train the graph generative model, we sample several sub-graphs from the given graph data. We show that since the number of adversarial edges is usually low in practice, the sampled sub-graphs will contain adversarial edges only with low probability, based on the union bound. In addition, considering strong attacks which perturb a large number of edges, we propose a set of novel features to perform outlier detection as the preprocessing for our detection. Extensive experimental results on three real-world graph datasets, including a private transaction rule dataset from a major company, and two types of synthetic graphs with controlled properties show that EDoG can achieve above 0.8 AUC against four state-of-the-art unseen attack strategies without requiring any knowledge about the attack type, and around 0.85 with knowledge of the attack type. EDoG significantly outperforms traditional malicious edge detection baselines. We also show that even an adaptive attack with full knowledge of our detection pipeline has difficulty bypassing it.

Understanding the Complexity Gains of Single-Task RL with a Curriculum

Qiyang Li, Yuexiang Zhai, Yi Ma, Sergey Levine

Category: Machine Learning

2022-12-24

Reinforcement learning (RL) problems can be challenging without well-shaped rewards. Prior work on provably efficient RL methods generally proposes to address this issue with dedicated exploration strategies. However, another way to tackle this challenge is to reformulate it as a multi-task RL problem, where the task space contains not only the challenging task of interest but also easier tasks that implicitly function as a curriculum. Such a reformulation opens up the possibility of running existing multi-task RL methods as a more efficient alternative to solving a single challenging task from scratch. In this work, we provide a theoretical framework that reformulates a single-task RL problem as a multi-task RL problem defined by a curriculum. Under mild regularity conditions on the curriculum, we show that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem, without any explicit exploration bonuses or other exploration strategies. We also show that our theoretical insights can be translated into an effective practical learning algorithm that can accelerate curriculum learning on simulated robotic tasks.

DDH-QA: A Dynamic Digital Humans Quality Assessment Database

Zicheng Zhang, Yingjie Zhou, Wei Sun, Wei Lu, Xiongkuo Min, Yu Wang, Guangtao Zhai

Category: Computer Vision

2022-12-24

In recent years, large amounts of effort have been put into pushing forward the real-world application of dynamic digital human (DDH). However, most current quality assessment research focuses on evaluating static 3D models and usually ignores motion distortions. Therefore, in this paper, we construct a large-scale dynamic digital human quality assessment (DDH-QA) database with diverse motion content as well as multiple distortions to comprehensively study the perceptual quality of DDHs. Both model-based distortion (noise, compression) and motion-based distortion (binding error, motion unnaturalness) are taken into consideration. Ten types of common motion are employed to drive the DDHs and a total of 800 DDHs are generated in the end. Afterward, we render the video sequences of the distorted DDHs as the evaluation media and carry out a well-controlled subjective experiment. Then a benchmark experiment is conducted with the state-of-the-art video quality assessment (VQA) methods and the experimental results show that existing VQA methods are limited in assessing the perceptual loss of DDHs. The database will be made publicly available to facilitate future research.

Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding

Haoli Bai, Zhiguang Liu, Xiaojun Meng, Wentao Li, Shuang Liu, Nian Xie, Rongfu Zheng, Liangwei Wang, Lu Hou, Jiansheng Wei

Category: Natural Language Processing | Computer Vision

2022-12-19

Unsupervised pre-training on millions of digital-born or scanned documents has shown promising advances in visual document understanding (VDU). While various vision-language pre-training objectives are studied in existing solutions, the document textline, as an intrinsic granularity in VDU, has seldom been explored so far. A document textline usually contains words that are spatially and semantically correlated, and it can be easily obtained from OCR engines. In this paper, we propose Wukong-Reader, trained with new pre-training objectives to leverage the structural knowledge nested in document textlines. We introduce textline-region contrastive learning to achieve fine-grained alignment between the visual regions and texts of document textlines. Furthermore, masked region modeling and textline-grid matching are also designed to enhance the visual and layout representations of textlines. Experiments show that our Wukong-Reader has superior performance on various VDU tasks such as information extraction. The fine-grained alignment over textlines also empowers Wukong-Reader with promising localization ability.

Robust Saliency Guidance for Data-free Class Incremental Learning

Xialei Liu, Jiang-Tian Zhai, Andrew D. Bagdanov, Ke Li, Ming-Ming Cheng

Category: Computer Vision

2022-12-16

Data-Free Class Incremental Learning (DFCIL) aims to sequentially learn tasks with access only to data from the current one. DFCIL is of interest because it mitigates concerns about privacy and long-term storage of data, while at the same time alleviating the problem of catastrophic forgetting in incremental learning. In this work, we introduce robust saliency guidance for DFCIL and propose a new framework, which we call RObust Saliency Supervision (ROSS), for mitigating the negative effect of saliency drift. Firstly, we use a teacher-student architecture leveraging low-level tasks to supervise the model with global saliency. We also apply boundary-guided saliency to protect it from drifting across object boundaries at intermediate layers. Finally, we introduce a module for injecting and recovering saliency noise to increase robustness of saliency preservation. Our experiments demonstrate that our method can retain better saliency maps across tasks and achieve state-of-the-art results on the CIFAR-100, Tiny-ImageNet and ImageNet-Subset DFCIL benchmarks. Code will be made publicly available.

FlexiViT: One Model for All Patch Sizes

Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, Filip Pavetic

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-12-15

Vision Transformers convert images to sequences by slicing them into patches. The size of these patches controls a speed/accuracy tradeoff, with smaller patches leading to higher accuracy at greater computational cost, but changing the patch size typically requires retraining the model. In this paper, we demonstrate that simply randomizing the patch size at training time leads to a single set of weights that performs well across a wide range of patch sizes, making it possible to tailor the model to different compute budgets at deployment time. We extensively evaluate the resulting model, which we call FlexiViT, on a wide range of tasks, including classification, image-text retrieval, open-world detection, panoptic segmentation, and semantic segmentation, concluding that it usually matches, and sometimes outperforms, standard ViT models trained at a single patch size in an otherwise identical setup. Hence, FlexiViT training is a simple drop-in improvement for ViT that makes it easy to add compute-adaptive capabilities to most models relying on a ViT backbone architecture. Code and pre-trained models are available at https://github.com/google-research/big_vision
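
The patch-size randomization described in this abstract can be illustrated with a small sketch. The image size (240) and the candidate patch sizes below are illustrative assumptions, not values stated in the abstract, and `sample_patch_size` and `num_tokens` are hypothetical helper names. The sketch only shows how the sampled patch size changes the sequence length; a real model would also have to adapt its patch-embedding weights and position embeddings to each size, which is omitted here.

```python
import random


def sample_patch_size(image_size, candidates, rng):
    """Pick a random patch size that tiles the image exactly."""
    valid = [p for p in candidates if image_size % p == 0]
    if not valid:
        raise ValueError("no candidate patch size divides the image size")
    return rng.choice(valid)


def num_tokens(image_size, patch_size):
    """Number of patches (sequence length) for a square image."""
    side = image_size // patch_size
    return side * side


# Assumed values for illustration only.
IMAGE_SIZE = 240
CANDIDATES = [8, 10, 12, 15, 16, 20, 24, 30, 40, 48]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# One patch size per (hypothetical) training step.
schedule = [sample_patch_size(IMAGE_SIZE, CANDIDATES, rng) for _ in range(5)]
lengths = [num_tokens(IMAGE_SIZE, p) for p in schedule]
```

With these assumed values, a patch size of 8 yields 900 tokens while a patch size of 48 yields 25, which is the speed/accuracy tradeoff the abstract refers to.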

IncepFormer: Efficient Inception Transformer with Pyramid Pooling for Semantic Segmentation

Lihua Fu, Haoyue Tian, Xiangping Bryce Zhai, Pan Gao, Xiaojiang Peng

Category: Computer Vision

2022-12-06

Semantic segmentation usually benefits from global contexts, fine localisation information, multi-scale features, etc. To advance Transformer-based segmenters in these aspects, we present a simple yet powerful semantic segmentation architecture, termed IncepFormer. IncepFormer makes two critical contributions. First, it introduces a novel pyramid-structured Transformer encoder which harvests global context and fine localisation features simultaneously. These features are concatenated and fed into a convolution layer for final per-pixel prediction. Second, IncepFormer integrates an Inception-like architecture with depth-wise convolutions, and a light-weight feed-forward module in each self-attention layer, efficiently obtaining rich local multi-scale object features. Extensive experiments on five benchmarks show that our IncepFormer is superior to state-of-the-art methods in both accuracy and speed, e.g., 1) our IncepFormer-S achieves 47.7% mIoU on ADE20K, which outperforms the existing best method by 1% while using only half the parameters and fewer FLOPs; 2) our IncepFormer-B achieves 82.0% mIoU on the Cityscapes dataset with 39.6M parameters. Code is available at github.com/shendu0321/IncepFormer.

Life-long Learning for Multilingual Neural Machine Translation with Knowledge Distillation

Yang Zhao, Junnan Zhu, Lu Xiang, Jiajun Zhang, Yu Zhou, Feifei Zhai, Chengqing Zong

Category: Natural Language Processing

2022-12-06

A common scenario in Multilingual Neural Machine Translation (MNMT) is that translation tasks arrive sequentially and the training data of previous tasks is unavailable. In this scenario, current methods suffer heavily from catastrophic forgetting (CF). To alleviate CF, we investigate life-long learning methods based on knowledge distillation. Specifically, in the one-to-many scenario, we propose a multilingual distillation method that makes the new model (student) jointly learn multilingual output from the old model (teacher) and the new task. In the many-to-one scenario, we find that direct distillation faces the extreme partial distillation problem, and we propose two different methods to address it: pseudo input distillation and reverse teacher distillation. The experimental results on twelve translation tasks show that the proposed methods can better consolidate the previous knowledge and sharply alleviate CF.
